Ch.16 Polynomials of Matrices

Polynomials of Maps and Matrices

Self-Composition

For a linear transformation $t:V\to V$, the domain and codomain are the same, so we can compose $t$ with itself: $t^2=t\circ t$ and $t^3=t\circ t\circ t$. In general, $t^{i+j}=t^i\circ t^j$.
Note that the exponent notation matches that of matrices, with $\text{Rep}_{B,B}(t^j)=T^j$.

Let $t$ be a linear transformation on $V$. Where $f(x)=c_nx^n+\cdots+c_1x+c_0$ is a polynomial, $f(t)=c_nt^n+\cdots+c_1t+c_0\,\text{id}$ is a transformation on $V$. Similarly, if $T$ is a square matrix then $f(T)=c_nT^n+\cdots+c_1T+c_0I$.
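As a quick sanity check of this definition, here is a minimal sketch (assuming NumPy; the matrix and coefficients are arbitrary illustrations, not taken from the text) that evaluates $f(T)=c_nT^n+\cdots+c_1T+c_0I$ by Horner's rule:

```python
import numpy as np

def poly_of_matrix(coeffs, T):
    """Evaluate f(T) = c_n T^n + ... + c_1 T + c_0 I by Horner's rule.
    `coeffs` lists c_n, ..., c_1, c_0, highest degree first."""
    n = T.shape[0]
    result = np.zeros((n, n))
    for c in coeffs:
        result = result @ T + c * np.eye(n)
    return result

# Illustrative choice: T has eigenvalues 2 and 3, and
# f(x) = x^2 - 5x + 6 = (x - 2)(x - 3), so f(T) is the zero matrix.
T = np.array([[2.0, 1.0],
              [0.0, 3.0]])
print(poly_of_matrix([1, -5, 6], T))
```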

We must prove the following:
Let $f(x)=c_nx^n+\cdots+c_1x+c_0$ and $g(x)=d_mx^m+\cdots+d_1x+d_0$ be two polynomials and $t$ be a linear transformation on $V$. If $s(x)=f(x)+g(x)$ is the sum of $f(x)$ and $g(x)$ and $p(x)=f(x)g(x)$ is the product, then $s(t)=f(t)+g(t)$ and $p(t)=f(t)\circ g(t)$.

Proof

The addition case is simple. To prove the product, for some $\vec{v}\in V$ write
$$f(t)\circ g(t)(\vec{v})=f(t)(g(t)(\vec{v}))=(c_nt^n+\cdots+c_1t+c_0\,\text{id}_V)(g(t)(\vec{v}))=c_nt^n(g(t)(\vec{v}))+\cdots+c_1t(g(t)(\vec{v}))+c_0(g(t)(\vec{v}))$$
From $g(x)$ and the fact that $t^i$ is a linear map,
$$\begin{array}{rcl}c_it^i(g(t)(\vec{v}))&=&c_it^i(d_mt^m(\vec{v})+\cdots+d_1t(\vec{v})+d_0(\vec{v}))\\&=&c_id_m(t^{i+m}(\vec{v}))+\cdots+c_id_1(t^{i+1}(\vec{v}))+c_id_0(t^{i}(\vec{v}))\end{array}$$
This is simply $p_i(t)(\vec{v})$, where $p_i(x)=c_id_mx^{i+m}+\cdots+c_id_1x^{i+1}+c_id_0x^i=c_ix^i(d_mx^m+\cdots+d_1x+d_0)=c_ix^ig(x)$.
Thus, from above,
$$f(t)\circ g(t)(\vec{v})=p_n(t)(\vec{v})+\cdots+p_1(t)(\vec{v})+p_0(t)(\vec{v})$$
From the sum case, we obtain $f(t)\circ g(t)(\vec{v})=S(t)(\vec{v})$ where $S(x)$ is
$$\begin{array}{rcl}S(x)&=&c_nx^ng(x)+\cdots+c_1xg(x)+c_0g(x)\\&=&(c_nx^n+\cdots+c_1x+c_0)g(x)\\&=&f(x)g(x)\\&=&p(x)\end{array}$$
Hence, $f(t)\circ g(t)(\vec{v})=p(t)(\vec{v})$ for all $\vec{v}\in V$, proving the result.

Thus, we have $f(t)\circ g(t)=g(t)\circ f(t)$. Furthermore, the range space $\mathscr{R}(f(t))$ and null space $\mathscr{N}(f(t))$ are stable (or invariant) under $g(t)$; i.e., $g(t)(\mathscr{R}(f(t)))\subseteq\mathscr{R}(f(t))$ and $g(t)(\mathscr{N}(f(t)))\subseteq\mathscr{N}(f(t))$.

Proof

If $\vec{v}\in\mathscr{R}(f(t))$, then we can find a $\vec{w}\in V$ such that $\vec{v}=f(t)(\vec{w})$. Hence, $g(t)(\vec{v})=g(t)(f(t)(\vec{w}))=f(t)(g(t)(\vec{w}))\in\mathscr{R}(f(t))$.
Similarly, if $\vec{v}\in\mathscr{N}(f(t))$, then $f(t)(g(t)(\vec{v}))=g(t)(f(t)(\vec{v}))=g(t)(\vec{0})=\vec{0}$, so $g(t)(\vec{v})\in\mathscr{N}(f(t))$.
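Both facts are easy to spot-check numerically. A sketch (NumPy assumed; the matrix and the polynomials $f$, $g$, and $p=fg$ are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((4, 4))   # an arbitrary square matrix
I = np.eye(4)

fT = T @ T + 2 * T + 3 * I        # f(x) = x^2 + 2x + 3
gT = T - 5 * I                    # g(x) = x - 5
# p(x) = f(x) g(x) = x^3 - 3x^2 - 7x - 15
pT = np.linalg.matrix_power(T, 3) - 3 * (T @ T) - 7 * T - 15 * I

print(np.allclose(fT @ gT, gT @ fT))   # True: f(T) and g(T) commute
print(np.allclose(fT @ gT, pT))        # True: f(T) g(T) = p(T)
```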

Let $t$ be a linear transformation on a vector space $V$. Suppose $\vec{v}$ is an eigenvector of $t$ associated with eigenvalue $\lambda$. For every polynomial $f(x)$, $\vec{v}$ is an eigenvector of $f(t)$ associated with eigenvalue $f(\lambda)$.

Proof

We have $t^i(\vec{v})=\lambda^i\vec{v}$. Therefore, if $f(x)=c_nx^n+\cdots+c_1x+c_0$, then $f(t)(\vec{v})=c_nt^n(\vec{v})+\cdots+c_1t(\vec{v})+c_0\vec{v}=c_n\lambda^n\vec{v}+\cdots+c_1\lambda\vec{v}+c_0\vec{v}=f(\lambda)\vec{v}$.
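Numerically (NumPy assumed; the symmetric matrix and the polynomial are arbitrary illustrations), the eigenvalues of $f(T)$ are exactly the values $f(\lambda)$:

```python
import numpy as np

T = np.array([[2.0, 1.0],
              [1.0, 2.0]])        # symmetric; eigenvalues 1 and 3

f = lambda x: x**2 + 1            # an arbitrary polynomial
fT = T @ T + np.eye(2)            # f(T) = T^2 + I

print(np.linalg.eigvalsh(T))      # [1. 3.]
print(np.linalg.eigvalsh(fT))     # [ 2. 10.] = [f(1), f(3)]
```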


Minimal Polynomial

We now aim to study the set of polynomials $p(x)$ such that for a given linear transformation $t:V\to V$ (or square matrix $T$), $p(t)=0$, the zero map (or $p(T)=0$, the zero matrix).

If $T$ is a square matrix, we can find a non-zero polynomial $p(x)$ such that $p(T)=0$. Similarly, if $t:V\to V$ is a linear transformation, we can find a non-zero polynomial $p(x)$ such that $p(t)=0$.

Proof

It suffices to prove the matrix case.
Since the vector space $\mathcal{M}_{n\times n}$ has dimension $n^2$, the $n^2+1$ member set $\{I,T,T^2,\ldots,T^{n^2}\}$ must be linearly dependent. Thus, there exist scalars $c_0,\ldots,c_{n^2}$, not all zero, such that $c_{n^2}T^{n^2}+\cdots+c_1T+c_0I=0$. Hence, the polynomial $p(x)=c_{n^2}x^{n^2}+\cdots+c_1x+c_0$ is a non-zero polynomial for which $p(T)=0$.
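The dependence in this proof can be exhibited concretely. A sketch (assuming SymPy; the $2\times2$ matrix is an arbitrary illustration) that flattens $I,T,\ldots,T^{n^2}$ into columns and reads a dependence off the null space:

```python
import sympy as sp

T = sp.Matrix([[0, 1], [-1, 0]])    # arbitrary example; n = 2
n = T.shape[0]

# Flatten I, T, T^2, ..., T^{n^2} into columns of an n^2 x (n^2 + 1) matrix.
powers = [T**k for k in range(n**2 + 1)]
A = sp.Matrix.hstack(*[P.reshape(n**2, 1) for P in powers])

# n^2 + 1 columns in an n^2-dimensional space: the null space is nonzero.
c = A.nullspace()[0]
print(c.T)   # coefficients c_0, ..., c_{n^2} with sum c_k T^k = 0
print(sum((c[k] * powers[k] for k in range(n**2 + 1)), sp.zeros(n, n)))
```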

From the proof above, we see that a polynomial of degree $n^2$ always suffices, but sometimes a polynomial of smaller degree is enough.

Example 16.1

Consider the matrix
$$T=\begin{pmatrix}\cos(\pi/3)&-\sin(\pi/3)\\\sin(\pi/3)&\cos(\pi/3)\end{pmatrix}=\begin{pmatrix}1/2&-\sqrt{3}/2\\\sqrt{3}/2&1/2\end{pmatrix}$$
The polynomial $p(x)=x^2-x+1$ satisfies $p(T)=0$.
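A quick numerical check (NumPy assumed):

```python
import numpy as np

T = np.array([[np.cos(np.pi/3), -np.sin(np.pi/3)],
              [np.sin(np.pi/3),  np.cos(np.pi/3)]])

pT = T @ T - T + np.eye(2)    # p(T) = T^2 - T + I
print(np.allclose(pT, 0))     # True
```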

If $f(x)$ is a polynomial that takes a matrix $T$ (or linear map $t$) to zero, then every eigenvalue of $T$ (or of $t$) is a root of $f(x)$.

Proof

Let $\vec{v}$ be an eigenvector of $t$ associated with eigenvalue $\lambda$. Then $f(t)(\vec{v})=f(\lambda)\vec{v}$. Since $f(t)=0$, we have $f(\lambda)\vec{v}=\vec{0}$, which implies $f(\lambda)=0$ because $\vec{v}\ne\vec{0}$, as eigenvectors are non-zero.

The minimal polynomial $m(x)$ of a transformation $t$ or square matrix $T$ is the non-zero polynomial of least degree and leading coefficient $1$ such that $m(t)=0$ or $m(T)=0$.

Since the leading coefficient must be $1$, $m(x)$ is not the zero polynomial; and the constant polynomial $m(x)=1$ gives $m(T)=I\ne0$, so $m(x)$ must have degree at least one.

Example 16.2

The zero matrix has minimal polynomial $m(x)=x$.
The identity matrix has minimal polynomial $m(x)=x-1$.

Any transformation/square matrix has a unique minimal polynomial.

Proof

First, we prove existence.
Since a polynomial exists that takes the map or matrix to zero, take one of smallest degree and divide by its leading coefficient to make the leading coefficient $1$. This satisfies the conditions, so a minimal polynomial exists.
Next, we prove uniqueness.
Suppose $m(x)$ and $\hat{m}(x)$ both take the map or matrix to zero, have minimal (hence equal) degree, and have leading coefficient $1$. Consider the difference $m(x)-\hat{m}(x)$. If this is not zero, then since the leading terms cancel it has smaller degree than $m$ and $\hat{m}$, and dividing by its leading coefficient gives a polynomial that takes the map or matrix to zero, has leading coefficient $1$, and has smaller degree than $m$ and $\hat{m}$. This contradicts the minimality of $m$ and $\hat{m}$; thus $m(x)-\hat{m}(x)=0$, making the two equal.

We strengthen the earlier claim that the eigenvalues are roots of $f(x)$.
The roots of the minimal polynomial are exactly the eigenvalues of a square matrix (or linear map).

Proof

Let $m(x)$ be the minimal polynomial of the linear map $t:V\to V$. Since $m(t)=0$, all eigenvalues are roots of $m(x)$. It remains to show that every root of $m(x)$ is an eigenvalue of $t$. If $r$ is a root of $m(x)$, then we can write $m(x)=(x-r)p(x)$ for a polynomial $p(x)$ of smaller degree than $m(x)$. Since $m(x)$ is the minimal polynomial of $t$, $p(t)$ is not the zero map. Therefore, there is a vector $\vec{v}\in V$ such that $p(t)(\vec{v})\ne\vec{0}$. Using this, we have $m(t)(\vec{v})=(t-r\,\text{id}_V)(p(t)(\vec{v}))$. Using $m(t)=0$ and distributing, $t(p(t)(\vec{v}))-rp(t)(\vec{v})=\vec{0}$, so $t(p(t)(\vec{v}))=rp(t)(\vec{v})$, which means $r$ is an eigenvalue of $t$ with eigenvector $p(t)(\vec{v})$.

For a polynomial $f(x)$, if $f(T)$ is the zero matrix then $f(x)$ is divisible by the minimal polynomial of $T$.

Proof

Let $m(x)$ be the minimal polynomial of $T$. Then by the division theorem for polynomials, $f(x)=q(x)\cdot m(x)+r(x)$, with $r(x)$ having degree strictly less than that of $m(x)$. Since $f(T)=m(T)=0$, $r(T)$ must also be the zero matrix. This would contradict the minimality of $m$ unless $r$ is the zero polynomial.

Cayley-Hamilton Theorem

If $T$ is a square matrix (or $t$ a linear map) with characteristic polynomial $c(x)$, then $c(T)$ is the zero matrix (or $c(t)$ the zero map). In particular, the minimal polynomial $m(x)$ of $T$ divides its characteristic polynomial.

Proof

Let $C=T-xI$, the matrix whose determinant is the characteristic polynomial $c(x)=c_nx^n+\cdots+c_1x+c_0$. Since the product of a matrix and its adjoint is its determinant times the identity, we can write
$$c(x)\cdot I=\text{adj}(C)\,C=\text{adj}(C)(T-xI)=\text{adj}(C)\,T-\text{adj}(C)\cdot x$$
The left side is $c_nIx^n+\cdots+c_1Ix+c_0I$. For the right, $\text{adj}(C)$ is a matrix of polynomials, so it can be expressed as a polynomial with matrix coefficients: $\text{adj}(C)=C_{n-1}x^{n-1}+\cdots+C_1x+C_0$, with each $C_i$ a matrix of scalars, making the right side
$$[(C_{n-1}T)x^{n-1}+\cdots+(C_1T)x+C_0T]-[C_{n-1}x^n+\cdots+C_1x^2+C_0x]$$
Equate the coefficients on each side:
$$\begin{array}{rcl}c_nI&=&-C_{n-1}\\c_{n-1}I&=&C_{n-1}T-C_{n-2}\\&\vdots&\\c_1I&=&C_1T-C_0\\c_0I&=&C_0T\end{array}$$
Multiply the first equation by $T^n$, the second equation by $T^{n-1}$, and so on:
$$\begin{array}{rcl}c_nT^n&=&-C_{n-1}T^n\\c_{n-1}T^{n-1}&=&C_{n-1}T^n-C_{n-2}T^{n-1}\\&\vdots&\\c_1T&=&C_1T^2-C_0T\\c_0I&=&C_0T\end{array}$$
Adding all the equations, the right side telescopes to zero, giving
$$c_nT^n+c_{n-1}T^{n-1}+\cdots+c_0I=c(T)=0$$
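A numerical spot-check of the theorem (NumPy assumed; note that `np.poly` uses the convention $\det(xI-T)$, which differs from $c(x)=\det(T-xI)$ only by the sign $(-1)^n$, so it vanishes at $T$ exactly when $c$ does):

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((5, 5))      # an arbitrary 5x5 matrix

coeffs = np.poly(T)                  # coefficients of det(xI - T), leading 1

# Evaluate the characteristic polynomial at T by Horner's rule.
cT = np.zeros_like(T)
for c in coeffs:
    cT = cT @ T + c * np.eye(5)

print(np.allclose(cT, 0, atol=1e-8))  # True, up to floating-point error
```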

Thus, if $\lambda_1,\lambda_2,\ldots,\lambda_l$ are the distinct eigenvalues of a linear transformation on an $n$-dimensional vector space (or of a square $n\times n$ matrix), then the characteristic polynomial factors as
$$c(x)=(-1)^n(x-\lambda_1)^{p_1}(x-\lambda_2)^{p_2}\cdots(x-\lambda_l)^{p_l}$$
and its minimal polynomial factors as
$$m(x)=(x-\lambda_1)^{q_1}(x-\lambda_2)^{q_2}\cdots(x-\lambda_l)^{q_l}$$
where $1\le q_i\le p_i$ for each $i\in\{1,\ldots,l\}$.

Example 16.3

Find the minimal polynomial of the following matrix
$$T=\begin{pmatrix}2&0&0&1\\1&2&0&2\\0&0&2&-1\\0&0&0&1\end{pmatrix}$$


The characteristic polynomial is easily computed as $c(x)=(x-1)(x-2)^3$. Thus, by the Cayley-Hamilton Theorem, the minimal polynomial is either $(x-1)(x-2)$, $(x-1)(x-2)^2$, or $(x-1)(x-2)^3$. We compute to check:
$$(T-I)(T-2I)=\begin{pmatrix}1&0&0&1\\1&1&0&2\\0&0&1&-1\\0&0&0&0\end{pmatrix}\begin{pmatrix}0&0&0&1\\1&0&0&2\\0&0&0&-1\\0&0&0&-1\end{pmatrix}=\begin{pmatrix}0&0&0&0\\1&0&0&1\\0&0&0&0\\0&0&0&0\end{pmatrix}$$
$$(T-I)(T-2I)^2=\begin{pmatrix}0&0&0&0\\1&0&0&1\\0&0&0&0\\0&0&0&0\end{pmatrix}\begin{pmatrix}0&0&0&1\\1&0&0&2\\0&0&0&-1\\0&0&0&-1\end{pmatrix}=\begin{pmatrix}0&0&0&0\\0&0&0&0\\0&0&0&0\\0&0&0&0\end{pmatrix}$$
So $m(x)=(x-1)(x-2)^2$.
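The same computation can be delegated to SymPy (assumed available); its `charpoly` returns the monic polynomial $\det(xI-T)$, which agrees with $c(x)$ here since $n=4$ is even:

```python
import sympy as sp

x = sp.symbols('x')
T = sp.Matrix([[2, 0, 0, 1],
               [1, 2, 0, 2],
               [0, 0, 2, -1],
               [0, 0, 0, 1]])
I = sp.eye(4)

print(sp.factor(T.charpoly(x).as_expr()))   # (x - 1)*(x - 2)**3

# Try the candidates in increasing degree; the first that annihilates T wins.
for k in (1, 2, 3):
    if ((T - I) * (T - 2*I)**k).is_zero_matrix:
        print(f'm(x) = (x - 1)(x - 2)^{k}')
        break
```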

A square matrix or linear map is diagonalizable if and only if its minimal polynomial $m(x)$ has simple roots, i.e. $m(x)=(x-\lambda_1)(x-\lambda_2)\cdots(x-\lambda_l)$ where the $\lambda_i$ are distinct.

Proof

Call $\lambda_1,\ldots,\lambda_l$ the distinct roots of the characteristic polynomial of the linear map $t$ on the $n$-dimensional vector space $V$. Note that $\lambda_1,\ldots,\lambda_l$ are also the distinct roots of the minimal polynomial $m(x)$ of $t$.
Suppose $t$ is diagonalizable. Then there exists a basis $B=\langle\vec{v}_1,\ldots,\vec{v}_n\rangle$ with each $\vec{v}_i$ an eigenvector of $t$. Consider the polynomial $p(x)=(x-\lambda_1)\cdots(x-\lambda_l)$. We claim $p(t)=0$. It suffices to check $p(t)(\vec{v}_i)=\vec{0}$ for all $\vec{v}_i$. We know $p(t)(\vec{v}_i)=p(\lambda_i)\vec{v}_i$, but $p(\lambda_i)=(\lambda_i-\lambda_1)\cdots(\lambda_i-\lambda_l)=0$ since $\lambda_i\in\{\lambda_1,\ldots,\lambda_l\}$, thus $p(t)(\vec{v}_i)=\vec{0}$. Hence the minimal polynomial $m(x)$ must divide $p(x)$, so $m(x)$ also has simple roots, and is in fact equal to $p(x)$ since $\lambda_1,\ldots,\lambda_l$ are the distinct roots of $m(x)$. Thus, if $t$ is diagonalizable then $m(x)$ has simple roots.
To prove the converse, we perform induction on the number of eigenvalues $l$, which is the degree of $m(x)$.
If $t$ has only one eigenvalue, then $m(x)=x-\lambda_1$, so $m(t)=t-\lambda_1\,\text{id}_V=0$ and $t=\lambda_1\,\text{id}_V$, whose matrix is diagonal with respect to any basis.
Let $m(x)=(x-\lambda_1)\cdots(x-\lambda_l)$ and let $p(x)=(x-\lambda_2)\cdots(x-\lambda_l)$, so that $m(x)=(x-\lambda_1)p(x)$. Suppose the statement is true whenever the minimal polynomial has degree less than $l$. Call $\hat{V}=\mathscr{N}(p(t))$. Since $p(t)$ commutes with $t$, we have $t(\hat{V})\subseteq\hat{V}$, so we can define a linear map $\hat{t}:\hat{V}\to\hat{V}$ by $\hat{t}(\vec{v})=t(\vec{v})$ for $\vec{v}\in\hat{V}$. Since $p(\hat{t})=0$ on $\hat{V}$, the minimal polynomial $\hat{m}(x)$ of $\hat{t}$ divides $p(x)$, making its roots simple. Since $\hat{m}(x)$ has at most $l-1$ roots, all simple, the inductive hypothesis gives a basis $\langle\vec{w}_1,\ldots,\vec{w}_{k}\rangle$ of $\hat{V}$ consisting of eigenvectors $\vec{w}_j$ of $\hat{t}$ (hence of $t$) with associated eigenvalues $\mu_j\in\{\lambda_2,\ldots,\lambda_l\}$.
Since $\hat{V}=\mathscr{N}(p(t))$ has $k$ vectors in its basis, the nullity of $p(t)$ is $k$. Hence, $k+\text{rank}(p(t))=n$, the dimension of $V$.
Next, we show that $\mathscr{R}(p(t))\subseteq\mathscr{N}(t-\lambda_1\,\text{id}_V)$. If $\vec{v}\in\mathscr{R}(p(t))$, we can find a $\vec{w}\in V$ such that $\vec{v}=p(t)(\vec{w})$, so $(t-\lambda_1\,\text{id}_V)(\vec{v})=(t-\lambda_1\,\text{id}_V)(p(t)(\vec{w}))$. But since $(x-\lambda_1)p(x)=m(x)$, this gives $(t-\lambda_1\,\text{id}_V)(\vec{v})=m(t)(\vec{w})=\vec{0}$, proving the claim.
Let $m$ be the nullity of $t-\lambda_1\,\text{id}_V$. By the claim above, $m+k\ge\text{rank}(p(t))+k=n$. Pick a basis $\langle\vec{v}_1,\ldots,\vec{v}_m\rangle$ of $\mathscr{N}(t-\lambda_1\,\text{id}_V)$. Each $\vec{v}_i$ is an eigenvector of $t$ associated with eigenvalue $\lambda_1$.
We claim $\langle\vec{v}_1,\ldots,\vec{v}_m,\vec{w}_1,\ldots,\vec{w}_k\rangle$ is a basis of $V$. Since $m+k\ge n$, it suffices to show these vectors are linearly independent. Suppose we can find scalars $\alpha_1,\ldots,\alpha_m,\beta_1,\ldots,\beta_k$ such that
$$\alpha_1\vec{v}_1+\cdots+\alpha_m\vec{v}_m+\beta_1\vec{w}_1+\cdots+\beta_k\vec{w}_k=\vec{0}$$
Applying the map $t-\lambda_1\,\text{id}_V$ to the above equation and using $(t-\lambda_1\,\text{id}_V)(\vec{v}_i)=\vec{0}$ and $(t-\lambda_1\,\text{id}_V)(\vec{w}_j)=(\mu_j-\lambda_1)\vec{w}_j$, we simplify to
$$\beta_1(\mu_1-\lambda_1)\vec{w}_1+\cdots+\beta_k(\mu_k-\lambda_1)\vec{w}_k=\vec{0}$$
Since $\langle\vec{w}_1,\ldots,\vec{w}_k\rangle$ is a basis of $\hat{V}$, this implies $\beta_j(\mu_j-\lambda_1)=0$ for all $j=1,\ldots,k$. But $\mu_j-\lambda_1\ne0$ since $\mu_j\in\{\lambda_2,\ldots,\lambda_l\}$ and all the $\lambda_i$ are distinct. Therefore, all of the $\beta_j=0$, reducing the linear dependence relation to
$$\alpha_1\vec{v}_1+\cdots+\alpha_m\vec{v}_m=\vec{0}$$
Since $\langle\vec{v}_1,\ldots,\vec{v}_m\rangle$ is a basis for $\mathscr{N}(t-\lambda_1\,\text{id}_V)$, we conclude $\alpha_i=0$ for all $i=1,\ldots,m$. Therefore, $\langle\vec{v}_1,\ldots,\vec{v}_m,\vec{w}_1,\ldots,\vec{w}_k\rangle$ is a basis of $V$ consisting of eigenvectors of $t$. Hence, $t$ is diagonalizable.
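The criterion is easy to test in SymPy (assumed available). The matrix of Example 16.3 has $m(x)=(x-1)(x-2)^2$, whose root $2$ is repeated, so it should fail to be diagonalizable:

```python
import sympy as sp

T = sp.Matrix([[2, 0, 0, 1],
               [1, 2, 0, 2],
               [0, 0, 2, -1],
               [0, 0, 0, 1]])
print(T.is_diagonalizable())    # False: m(x) = (x - 1)(x - 2)^2, repeated root

# By contrast, a diagonal matrix with the same eigenvalues has
# m(x) = (x - 1)(x - 2), with simple roots.
D = sp.diag(1, 2, 2, 2)
print(D.is_diagonalizable())    # True
```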


Powers of Transformations

For transformations $t:V\to V$ we can consider powers, as mentioned at the beginning of this chapter.

Example 16.4

For the derivative map $d/dx:\mathcal{P}_3\to\mathcal{P}_3$, we have
$$a+bx+cx^2+dx^3\xmapsto{d/dx}b+2cx+3dx^2$$
$$a+bx+cx^2+dx^3\xmapsto{d^2/dx^2}2c+6dx$$
$$a+bx+cx^2+dx^3\xmapsto{d^3/dx^3}6d$$
and any higher power maps everything to $0$.

For any transformation $t:V\to V$, the range spaces of the powers form a descending chain:
$$V\supseteq\mathscr{R}(t)\supseteq\mathscr{R}(t^2)\supseteq\cdots$$
and the null spaces form an ascending chain:
$$\{\vec{0}\}\subseteq\mathscr{N}(t)\subseteq\mathscr{N}(t^2)\subseteq\cdots$$
Further, there is a $k>0$ such that for powers less than $k$ the subsets are proper; i.e., if $j<k$ then $\mathscr{R}(t^j)\supset\mathscr{R}(t^{j+1})$ and $\mathscr{N}(t^j)\subset\mathscr{N}(t^{j+1})$, while if $j\ge k$ then $\mathscr{R}(t^j)=\mathscr{R}(t^{j+1})$ and $\mathscr{N}(t^j)=\mathscr{N}(t^{j+1})$.
($k=1$ can happen if, for example, $t$ is invertible, as none of the subsets in the chain would be a proper subset.)

Proof

If $\vec{w}\in\mathscr{R}(t^{j+1})$, so that $\vec{w}=t^{j+1}(\vec{v})$ for some $\vec{v}$, then $\vec{w}=t^j(t(\vec{v}))$, so $\vec{w}\in\mathscr{R}(t^j)$. Thus $\mathscr{R}(t^{j+1})\subseteq\mathscr{R}(t^j)$. If $\mathscr{R}(t^k)=\mathscr{R}(t^{k+1})$, then applying $t$ to both sides gives $\mathscr{R}(t^{k+1})=\mathscr{R}(t^{k+2})$, and by induction the chain stays constant from $k$ on: we have the same map restricted to the same domain, hence the same range. And the chain must eventually stop shrinking, because $V$ is finite-dimensional and a proper subspace has dimension strictly less than that of its superspace.
Finally, by the rank-nullity theorem, the corresponding statements hold for the null spaces.
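A sketch of the stabilizing chains (NumPy assumed; the matrix, an arbitrary illustration, combines a nilpotent block with an invertible one):

```python
import numpy as np

T = np.array([[0., 1., 0., 0.],
              [0., 0., 1., 0.],
              [0., 0., 0., 0.],
              [0., 0., 0., 2.]])   # 3x3 nilpotent shift block plus the scalar 2

for j in range(6):
    r = np.linalg.matrix_rank(np.linalg.matrix_power(T, j))
    print(j, r, 4 - r)   # rank of t^j and nullity of t^j
# ranks fall 4, 3, 2, 1 and then stay at 1; nullities rise 0, 1, 2, 3
# and then stay at 3, so here k = 3
```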

Example 16.5

For the derivative map $d/dx:\mathcal{P}_3\to\mathcal{P}_3$ from before, we have the following chain of range spaces:
$$\mathcal{P}_3\supset\mathcal{P}_2\supset\mathcal{P}_1\supset\mathcal{P}_0\supset\{\vec{0}\}$$
and the following chain of null spaces:
$$\{\vec{0}\}\subset\mathcal{P}_0\subset\mathcal{P}_1\subset\mathcal{P}_2\subset\mathcal{P}_3$$
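The same chains can be computed from the matrix of $d/dx$ with respect to the basis $\langle 1,x,x^2,x^3\rangle$ (NumPy assumed; the matrix is the standard representation, written out by hand):

```python
import numpy as np

# d/dx sends x^k to k x^{k-1}; its matrix in the basis <1, x, x^2, x^3>:
D = np.array([[0., 1., 0., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 3.],
              [0., 0., 0., 0.]])

for j in range(5):
    print(j, np.linalg.matrix_rank(np.linalg.matrix_power(D, j)))
# ranks 4, 3, 2, 1, 0: the range spaces are P_3, P_2, P_1, P_0, {0}
```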

Let $t$ be a transformation on an $n$-dimensional space.
The generalized range space (or closure of the range space) is $\mathscr{R}_\infty(t)=\mathscr{R}(t^n)$.
The generalized null space (or closure of the null space) is $\mathscr{N}_\infty(t)=\mathscr{N}(t^n)$.

Essentially, this is the point where the range space and null space stabilize, which happens after at most $n$ iterations.

As $j$ increases, the dimensions of the $\mathscr{R}(t^j)$ fall while the dimensions of the $\mathscr{N}(t^j)$ rise, so that $V$ is split between them.
For any linear $t:V\to V$, the function $t:\mathscr{R}_\infty(t)\to\mathscr{R}_\infty(t)$ is bijective. Therefore $\mathscr{R}_\infty(t)\cap\mathscr{N}_\infty(t)=\{\vec{0}\}$.

Proof

Let the dimension of $V$ be $n$. Because $\mathscr{R}(t^n)=\mathscr{R}(t^{n+1})=t(\mathscr{R}(t^n))$, the map $t:\mathscr{R}_\infty(t)\to\mathscr{R}_\infty(t)$ is onto. Since it is an onto map between spaces of the same finite dimension, it is also one-to-one.
Next, assume $\vec{v}\in\mathscr{R}_\infty(t)\cap\mathscr{N}_\infty(t)$. Since $\vec{v}$ is in the generalized null space, $t^n(\vec{v})=\vec{0}$. On the other hand, since $t$ is one-to-one on $\mathscr{R}_\infty(t)$ and a composition of one-to-one functions is one-to-one, $t^n$ is also one-to-one there. Only $\vec{0}$ maps to $\vec{0}$ under a one-to-one linear map, so $t^n(\vec{v})=\vec{0}$ implies $\vec{v}=\vec{0}$.

As a result, the union $\mathscr{R}_\infty(t)\cup\mathscr{N}_\infty(t)$ spans $V$. In fact, if $\langle\vec{v}_1,\ldots,\vec{v}_k\rangle$ is a basis of $\mathscr{R}_\infty(t)$ and $\langle\vec{w}_1,\ldots,\vec{w}_l\rangle$ is a basis of $\mathscr{N}_\infty(t)$, then $\langle\vec{v}_1,\ldots,\vec{v}_k,\vec{w}_1,\ldots,\vec{w}_l\rangle$ is a basis of $V$. Moreover, with respect to this basis, the matrix of $t$ is block diagonal: $\begin{pmatrix}S&Z\\Z&N\end{pmatrix}$, where $Z$ is a block of zeroes, $S\in\mathcal{M}_{k\times k}$ is invertible, and $N\in\mathcal{M}_{l\times l}$ satisfies $N^l=0$.

Proof

Let the dimension of $V$ be $n$. By rank-nullity, $k+l=n$, so $B=\langle\vec{v}_1,\ldots,\vec{v}_k,\vec{w}_1,\ldots,\vec{w}_l\rangle$ has $n$ vectors. To show $B$ is a basis of $V$, it suffices to show that this set is linearly independent. If a linear combination of the $\vec{v}_i$ plus a linear combination of the $\vec{w}_j$ equals $\vec{0}$, the two combinations are negatives of each other, so each lies in $\mathscr{R}_\infty(t)\cap\mathscr{N}_\infty(t)=\{\vec{0}\}$; since each list is a basis, all the coefficients are zero.
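These facts can be checked on the earlier rank-chain example (NumPy and SciPy assumed; `orth` and `null_space` return orthonormal bases of a matrix's column space and null space):

```python
import numpy as np
from scipy.linalg import orth, null_space

T = np.array([[0., 1., 0., 0.],
              [0., 0., 1., 0.],
              [0., 0., 0., 0.],
              [0., 0., 0., 2.]])
Tn = np.linalg.matrix_power(T, 4)   # n = 4

R = orth(Tn)          # basis of R_inf(t)
N = null_space(Tn)    # basis of N_inf(t)

# t maps R_inf onto itself: adjoining T @ R to R adds no new directions.
print(np.linalg.matrix_rank(np.hstack([R, T @ R])) == R.shape[1])  # True
# R_inf and N_inf intersect trivially and together fill V.
print(np.linalg.matrix_rank(np.hstack([R, N])))                    # 4
```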

For powers between $j=0$ and $j=n$, the intersection of the range space and null space may be nontrivial.
Consider the shift map $n:\mathbb{C}^2\to\mathbb{C}^2$ defined by
$$\begin{pmatrix}x\\y\end{pmatrix}\mapsto\begin{pmatrix}0\\x\end{pmatrix}$$
On the standard basis, this map's action gives a string
$$\begin{pmatrix}1\\0\end{pmatrix}\mapsto\begin{pmatrix}0\\1\end{pmatrix}\mapsto\begin{pmatrix}0\\0\end{pmatrix}\quad\text{that is}\quad\vec{e}_1\mapsto\vec{e}_2\mapsto\vec{0}$$
Notice how $\vec{e}_2=\begin{pmatrix}0\\1\end{pmatrix}$ is in both the range space and the null space. Also observe that though $n$ is not the zero map, the function $n^2=n\circ n$ is the zero map. This will be explored next.
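The matrix of this shift map makes the nilpotency visible (NumPy assumed):

```python
import numpy as np

N = np.array([[0., 0.],
              [1., 0.]])           # (x, y) |-> (0, x)

print(N @ np.array([1., 0.]))     # e1 |-> e2
print(N @ np.array([0., 1.]))     # e2 |-> 0: e2 is in both range and null space
print(N @ N)                      # n^2 is the zero map
```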